111 research outputs found

    Supervised classification and mathematical optimization

    Data Mining techniques often require solving optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data. Ministerio de Ciencia e Innovación; Junta de Andalucía.

    Semi-obnoxious location models: a global optimization approach

    In the last decades there has been an increasing interest in environmental topics. This interest has been reflected in modeling the location of obnoxious facilities, as shown by the large number of papers published in this field. However, a very common drawback of the existing literature is that, as soon as environmental aspects are taken into account, economic considerations (e.g. transportation costs) are ignored, leading to models of dubious practical interest. In this paper we take into account both the environmental impact and the transportation costs caused by the location of an obnoxious facility, and propose as solution method the well-known BSSS (Big Square Small Square), with a new bounding scheme which exploits the structure of the problem. Dirección General de Investigación Científica y Técnica.

    Ecologically Designed Sanitary Sewer Based on Constructed Wetlands Technology – Case Study in Managua (Nicaragua)

    In developed countries the sanitation and treatment of urban wastewater is well sustained and technically solved by means of conventional pipe networks and subsequent centralized treatments. However, developing countries lack these infrastructures and are in need of sustainable, decentralized and economically viable solutions for the disposal of their urban wastewaters. In addition to this, there are situations where the demands of conservation of natural spaces do not allow intensive constructive procedures and which force the implementation of sanitary engineering with less environmental impact. We present the Ecological Wastewater Sewer (EWS), an ecological urban sewerage system that simultaneously transports wastewater and improves its quality. This innovative technology is an alternative to conventional sanitation piping that has minimal environmental impact. It is based on a successful previous work for the improvement of artificial wetlands in a pilot scheme and at full-scale on a test site. The EWS is a channel-shaped device that relies on the application of two key developments: a carefully designed cornered stones layout, and the creation of a natural aeration system. This way, it acts as a separating sewage system that guarantees the presence of a chamber of circulating air within the transportation unit, favouring permanent aerobic conditions in the upper levels of the mass of wastewater. Furthermore, its capacity to settle suspended solids allows the EWS to be used as a sedimentor in water purification processes. A real-life application of this system proved successful in the sanitation of a district of Managua (Nicaragua). Working with a 100-metre-long street of 20 one-story houses, the system is reported to still be in full operating order after six years. The conclusions and results drawn from its monitoring are meticulously explained in our paper, as well as the recommendations and guidelines for the design of more EWS units, with the aim of popularizing this affordable, efficient and green approach to wastewater sanitation. Andalusian International School of Water Engineering; City hall of Seville; Cooperation Office at the University of Seville.

    A biobjective method for sample allocation in stratified sampling

    The two main, conflicting criteria guiding sampling design are accuracy of estimators and sampling costs. In stratified random sampling, the sample size must be allocated to strata in order to optimize both objectives. In this note we address, following a biobjective methodology, this allocation problem. A two-phase method is proposed to describe the set of Pareto-optimal solutions of this nonlinear integer biobjective problem. In the first phase, all supported Pareto-optimal solutions are described via a closed formula, which enables quick computation. Moreover, for the common case in which sampling costs are independent of the strata, all Pareto-optimal solutions are shown to be supported. For more general cost structures, the non-supported Pareto-optimal solutions are found by solving a parametric knapsack problem. Bounds on the criteria can also be imposed, directing the search towards implementable sampling plans. Our method provides a deeper insight into the problem than simply solving a scalarized version, whereas the computational burden is reasonable. Ministerio de Ciencia y Tecnología.
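
    To make the biobjective allocation problem concrete, the sketch below enumerates integer allocations for a tiny instance and keeps the Pareto-optimal (variance, cost) pairs. This is brute force for illustration only, not the two-phase method of the abstract. The function name, the variance formula (stratified mean without finite-population correction) and the linear cost model are assumptions.

```python
from itertools import product


def pareto_allocations(weights, stds, costs, max_per_stratum):
    """Brute-force Pareto front for a small stratified allocation problem.

    Assumed objectives, both minimized:
      variance = sum_h (W_h * S_h)^2 / n_h   (no finite-population correction)
      cost     = sum_h c_h * n_h
    Returns a list of (variance, cost, allocation) sorted by increasing cost.
    """
    candidates = []
    for alloc in product(range(1, max_per_stratum + 1), repeat=len(weights)):
        variance = sum((w * s) ** 2 / n for w, s, n in zip(weights, stds, alloc))
        cost = sum(c * n for c, n in zip(costs, alloc))
        candidates.append((variance, cost, alloc))

    # Sort by cost, then variance; a point is kept only if it strictly
    # improves the best variance seen among cheaper-or-equal allocations.
    candidates.sort(key=lambda t: (t[1], t[0]))
    front, best_variance = [], float("inf")
    for variance, cost, alloc in candidates:
        if variance < best_variance:
            front.append((variance, cost, alloc))
            best_variance = variance
    return front


if __name__ == "__main__":
    # Hypothetical two-stratum instance.
    for variance, cost, alloc in pareto_allocations([0.5, 0.5], [1.0, 1.0], [1, 1], 2):
        print(alloc, cost, variance)
```

    Allocations that tie in both objectives are represented by a single entry. Enumeration grows exponentially with the number of strata, which is the kind of burden a closed formula for supported solutions avoids.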

    Visualizing proportions and dissimilarities by space-filling maps: a large neighborhood search approach

    In this paper we address the problem of visualizing a set of individuals, each with an attached statistical value given as a proportion, together with a dissimilarity measure. Each individual is represented as a region within the unit square, in such a way that the areas of the regions represent the proportions and the distances between them represent the dissimilarities. To enhance the interpretability of the representation, the regions are required to satisfy two properties. First, they must form a partition of the unit square, namely, the portions into which it is divided must cover its area without overlapping. Second, the portions must be made of a connected union of rectangles which satisfy the so-called box-connectivity constraints, yielding a visualization map called Space-filling Box-connected Map (SBM). The construction of an SBM is formally stated as a mathematical optimization problem, which is solved heuristically by using the Large Neighborhood Search technique. The methodology proposed in this paper is applied to three real-world datasets: the first one concerning financial markets in Europe and Asia, the second one about the letters in the English alphabet, and finally the provinces of The Netherlands as a geographical application. Ministerio de Economía y Competitividad; Junta de Andalucía; European Regional Development Fund.

    Heuristic approaches for support vector machines with the ramp loss

    Recently, Support Vector Machines with the ramp loss (RLM) have attracted attention from the computational point of view. In this technical note, we propose two heuristics, the first one based on solving the continuous relaxation of a Mixed Integer Nonlinear formulation of the RLM and the second one based on the training of an SVM classifier on a reduced dataset identified by an integer linear problem. Our computational results illustrate the ability of our heuristics to handle datasets of much larger size than those previously addressed in the literature. Ministerio de Economía y Competitividad; Junta de Andalucía; European Regional Development Fund.
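
    For readers unfamiliar with the ramp loss, the sketch below contrasts it with the usual hinge loss. It assumes the common convention in which the ramp loss is the hinge loss truncated at 2. The function names and the `cap` parameter are illustrative and not taken from the paper.

```python
def hinge_loss(margin):
    """Hinge loss for a signed margin y * f(x): grows without bound."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin, cap=2.0):
    """Ramp loss: the hinge loss truncated at `cap` (assumed to be 2 here).

    Truncation bounds the penalty of any single badly misclassified point,
    which limits the influence of outliers but makes the loss non-convex.
    """
    return min(cap, hinge_loss(margin))


if __name__ == "__main__":
    for m in (2.0, 0.5, -1.0, -5.0):
        print(m, hinge_loss(m), ramp_loss(m))
```

    The non-convexity introduced by the truncation is why exact formulations of this model involve integer variables, and why heuristics become attractive for large datasets.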

    Multi-group support vector machines with measurement costs: a biobjective approach

    The Support Vector Machine has been shown to perform well in many practical classification settings. In this paper we propose, for multi-group classification, a biobjective optimization model in which we consider not only the generalization ability (modelled through the margin maximization), but also costs associated with the features. This cost is not limited to an economic payment, but can also refer to risk, computational effort, space requirements, etc. We introduce a biobjective mixed integer problem, for which Pareto optimal solutions are obtained. Those Pareto optimal solutions correspond to different classification rules, among which the user would choose the one yielding the most appropriate compromise between the cost and the expected misclassification rate. Ministerio de Ciencia y Tecnología; Plan Andaluz de Investigación.

    Strongly agree or strongly disagree? Rating features in support vector machines

    In linear classifiers, such as the Support Vector Machine (SVM), a score is associated with each feature and objects are assigned to classes based on the linear combination of the scores and the values of the features. Inspired by discrete psychometric scales, which measure the extent to which a factor is in agreement with a statement, we propose the Discrete Level Support Vector Machine (DILSVM) where the feature scores can only take on a discrete number of values, defined by the so-called feature rating levels. The DILSVM classifier benefits from interpretability as it can be seen as a collection of Likert scales, one for each feature, where we rate the level of agreement with the positive class. To build the DILSVM classifier, we propose a Mixed Integer Linear Programming approach, as well as a collection of strategies to reduce the building times. Our computational experience shows that the 3-point and the 5-point DILSVM classifiers have comparable accuracy to the SVM with a substantial gain in interpretability and sparsity, thanks to the appropriate choice of the feature rating levels. Ministerio de Economía y Competitividad; Junta de Andalucía; Fondo Europeo de Desarrollo Regional.
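
    The idea of restricting feature scores to a few rating levels can be illustrated with a small prediction routine. This is only a sketch of how such a classifier would be applied once built; it does not reproduce the Mixed Integer Linear Programming training step. The function name, the symmetric integer levels (-1, 0, 1) for a 3-point scale and the tie-breaking rule at zero are assumptions.

```python
def dilsvm_predict(x, scores, bias, levels=(-1, 0, 1)):
    """Classify x with a linear rule whose feature scores lie in `levels`.

    x      : feature values of one object
    scores : one score per feature, each required to be a rating level
    bias   : intercept of the linear rule
    Returns +1 or -1 (a total of exactly zero is assigned to +1 by assumption).
    """
    if len(x) != len(scores):
        raise ValueError("x and scores must have the same length")
    if any(s not in levels for s in scores):
        raise ValueError("every score must be one of the rating levels")
    total = sum(s * v for s, v in zip(scores, x)) + bias
    return 1 if total >= 0 else -1


if __name__ == "__main__":
    # Hypothetical 3-point classifier: feature 2 is ignored (score 0).
    print(dilsvm_predict([2.0, 5.0, 1.0], [1, 0, -1], -0.5))
```

    A score of 0 drops the feature entirely, which is where the sparsity mentioned in the abstract comes from, while the remaining scores read like levels of agreement with the positive class.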

    Detecting relevant variables and interactions in supervised classification

    The widely used Support Vector Machine (SVM) method has been shown to yield good results in Supervised Classification problems. When interpretability is an important issue, classification methods such as Classification Trees (CART) might be more attractive, since they are designed to detect the important predictor variables and, for each predictor variable, the critical values which are most relevant for classification. However, when interactions between variables strongly affect the class membership, CART may yield misleading information. Extending previous work of the authors, in this paper an SVM-based method is introduced. The numerical experiments reported show that our method is competitive against SVM and CART in terms of misclassification rates, and, at the same time, is able to detect critical values and variable interactions which are relevant for classification. Ministerio de Educación y Ciencia; Junta de Andalucía.

    Clustering categories in support vector machines

    The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in interpretability. The CLSVM methodology clusters categories and builds the SVM classifier in the clustered feature space. Four strategies for building the CLSVM classifier are presented based on solving: the SVM formulation in the original feature space, a quadratically constrained quadratic programming formulation, and a mixed integer quadratic programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM in the original feature space, with a dramatic increase in sparsity. Ministerio de Economía y Competitividad; Junta de Andalucía.
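
    The notion of a clustered feature space can be sketched as follows: instead of one dummy variable per category, each observation is encoded by the cluster its category belongs to. In this sketch the category-to-cluster assignment is supplied by hand, whereas the CLSVM methodology determines it by optimization. The function name and the example categories are hypothetical.

```python
def encode_clustered(values, category_to_cluster, n_clusters):
    """Encode a categorical feature with one indicator per cluster.

    values              : category label of each observation
    category_to_cluster : dict mapping each category to a cluster index
    n_clusters          : number of clusters (two in the abstract's study)
    Returns a list of 0/1 rows, one per observation.
    """
    rows = []
    for value in values:
        row = [0] * n_clusters
        row[category_to_cluster[value]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical feature with three categories grouped into two clusters.
    mapping = {"red": 0, "green": 0, "blue": 1}
    print(encode_clustered(["red", "blue", "green"], mapping, 2))
```

    With many categories and only two clusters, a linear classifier needs far fewer coefficients for this feature, which is the sparsity gain the abstract refers to.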